Computer Organization and Design - Patterson

7.22.1 [10] <7.11> In future systems, we expect to see heterogeneous computing platforms constructed out of heterogeneous CPUs. We have begun to see some appear in the embedded processing market, in systems that contain both floating-point DSPs and microcontroller CPUs in a multichip module package. Assume that you have three classes of CPU:

CPU A: a moderate-speed multicore CPU (with a floating-point unit) that can execute multiple instructions per cycle.
CPU B: a fast single-core integer CPU (i.e., no floating-point unit) that can execute a single instruction per cycle.
CPU C: a slow vector CPU (with floating-point capability) that can execute multiple copies of the same instruction per cycle.

Assume that our processors run at the following frequencies: [frequency table not reproduced in this copy]. CPU A can execute 2 instructions per cycle, CPU B can execute 1 instruction per cycle, and CPU C can execute 8 instructions (though all must be the same instruction) per cycle. Assume all operations can complete execution in a single cycle of latency without any hazards.

All three CPUs can perform integer arithmetic, though CPU B cannot perform floating-point arithmetic. CPUs A and B have an instruction set similar to that of a MIPS processor. CPU C can only perform floating-point add and subtract operations, as well as memory loads and stores. Assume all CPUs have access to shared memory and that synchronization has zero cost.

The task at hand is to compare two matrices X and Y that each contain 1024 × 1024 floating-point elements. The output should be a count of the number of indices where the value in X was larger than or equal to the value in Y. Describe how you would partition the problem on the three different CPUs to obtain the best performance.
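To make the task concrete, here is a minimal Python sketch of one possible decomposition. It is not the textbook's solution. It assumes the comparison X >= Y is split into a floating-point subtract (which a unit like CPU C could do) and an integer-only sign-bit test on the difference (which needs no floating-point unit), and that elements are divided among workers in proportion to their relative throughput. The helper names `is_nonnegative`, `split_range`, and `count_ge` are hypothetical, and so are the `weights`: they stand in for clock frequency times instructions per cycle, and the frequencies are not given here. The sketch also assumes the data contains no NaN values and uses flat lists in place of the 1024 × 1024 matrices.

```python
import struct


def is_nonnegative(diff: float) -> bool:
    """Integer-only test that a double is >= 0 (assumes diff is not NaN)."""
    # Reinterpret the 64-bit IEEE 754 pattern as an unsigned integer.
    bits = struct.unpack("<Q", struct.pack("<d", diff))[0]
    sign_clear = (bits >> 63) == 0
    # -0.0 has the sign bit set but still compares equal to zero.
    is_zero = (bits & 0x7FFFFFFFFFFFFFFF) == 0
    return sign_clear or is_zero


def split_range(n: int, weights: list[int]) -> list[tuple[int, int]]:
    """Split range(n) into contiguous (lo, hi) slices proportional to weights."""
    total = sum(weights)
    bounds = [0]
    acc = 0
    for w in weights:
        acc += w
        bounds.append(n * acc // total)  # last bound is exactly n
    return [(bounds[i], bounds[i + 1]) for i in range(len(weights))]


def count_ge(x: list[float], y: list[float], weights: list[int]) -> int:
    """Count indices i where x[i] >= y[i], one partial count per worker."""
    partial_counts = []
    for lo, hi in split_range(len(x), weights):
        # Step 1: floating-point subtract only.
        diffs = [x[i] - y[i] for i in range(lo, hi)]
        # Step 2: integer-only sign test and accumulation.
        partial_counts.append(sum(1 for d in diffs if is_nonnegative(d)))
    # Shared memory and zero-cost synchronization make the final sum trivial.
    return sum(partial_counts)


if __name__ == "__main__":
    x = [1.0, 2.0, -0.0, 5.5, -3.0, 0.0]
    y = [1.0, 3.0, 0.0, 5.0, -2.0, -1.0]
    # Placeholder weights only; real weights need the missing frequencies.
    print(count_ge(x, y, weights=[2, 1, 8]))
```

The loop runs the slices one after another. On real hardware each slice would run concurrently on its own CPU, and the best split depends on the actual frequencies and on which steps each CPU can execute.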






Solupals - Textbook Solutions: Computer Organization and Design - Patterson - 4ed